Full “Laplacianised” posterior naive Bayesian algorithm

Authors

  • Hamse Y. Mussa
  • John B. O. Mitchell
  • Robert C. Glen
Abstract

BACKGROUND: In the last decade the standard Naive Bayes (SNB) algorithm has been widely employed in multi-class classification problems in cheminformatics. This popularity is mainly due to the fact that the algorithm is simple to implement and in many cases yields respectable classification results. Using clever heuristic arguments "anchored" by insightful cheminformatics knowledge, Xia et al. simplified the SNB algorithm further and termed it the Laplacian Corrected Modified Naive Bayes (LCMNB) approach, which has been widely used in cheminformatics since its publication. In this note we illustrate mathematically the conditions under which Xia et al.'s simplification holds. It is our hope that this clarification will help Naive Bayes practitioners decide when it is appropriate to employ the LCMNB algorithm to classify large chemical datasets.

RESULTS: A general formulation that subsumes the simplified Naive Bayes version is presented. Unlike the widely used NB method, the Standard Naive Bayes description presented in this work is discriminative (not generative) in nature, which may lead to further applications of the SNB method.

CONCLUSIONS: Starting from a standard Naive Bayes (SNB) algorithm, we have derived mathematically the relationship between Xia et al.'s ingenious but heuristic algorithm and the SNB approach. We have also demonstrated the conditions under which Xia et al.'s crucial assumptions hold. We therefore hope that the new insight and recommendations provided will prove useful to the cheminformatics community.
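For readers unfamiliar with the LCMNB weighting scheme discussed above, the sketch below illustrates one commonly used Laplacian-corrected form for binary fingerprint features, in which a compound's score is the sum over its set bits of log((A_i + 1) / (T_i * P_base + 1)), with A_i the number of active compounds containing feature i, T_i the total number of compounds containing it, and P_base the baseline fraction of actives. This is a minimal illustrative sketch under these assumptions, not the derivation given in the paper; the function names (lcmnb_weights, lcmnb_score) and the binary active/inactive setting are choices made here for clarity.

    import numpy as np

    def lcmnb_weights(X, y):
        """Laplacian-corrected per-feature log-weights (illustrative sketch).

        X : (n_samples, n_features) binary fingerprint matrix
        y : (n_samples,) 0/1 activity labels, 1 = active
        """
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        p_base = y.mean()              # baseline probability of the active class
        a_i = X[y == 1].sum(axis=0)    # actives containing each feature
        t_i = X.sum(axis=0)            # all compounds containing each feature
        # Laplacian-corrected relative likelihood per feature:
        # P_corr(active | f_i) / P(active) = (a_i + 1) / (t_i * p_base + 1)
        return np.log((a_i + 1.0) / (t_i * p_base + 1.0))

    def lcmnb_score(X, weights):
        """Rank compounds by summing the weights of the features they contain."""
        return np.asarray(X, dtype=float) @ weights

The +1 terms damp the contribution of rarely observed features: a feature seen in only one compound contributes a weight of at most about log 2, rather than the extreme log-ratio an uncorrected estimate would give.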


Similar Articles

A Validation Test Naive Bayesian Classification Algorithm and Probit Regression as Prediction Models for Managerial Overconfidence in Iran's Capital Market

Corporate directors are influenced by overconfidence, one of the personality traits of individuals, which may lead to irrational decisions that have a significant impact on the company's performance in the long run. The purpose of this paper is to validate and compare the Naive Bayesian Classification algorithm and probit regression for the prediction of managerial overconfidence at pre...


Bayesian Quantile Regression with Adaptive Elastic Net Penalty for Longitudinal Data

Longitudinal studies form an important part of epidemiological surveys, clinical trials and social studies. In longitudinal studies, the responses are measured repeatedly over time. Often, the main goal is to characterize the change in responses over time and the factors that influence that change. Recently, to analyze this kind of data, quantile regression has been taken ...


Bayesian Quantile Regression with Adaptive Lasso Penalty for Dynamic Panel Data

Dynamic panel data models form an important part of medical, social and economic studies. The presence of the lagged dependent variable as an explanatory variable is a distinctive feature of these models. The estimation problem in these models arises from the correlation between the lagged dependent variable and the current disturbance. Recently, quantile regression to analyze dynamic pa...


A Bayesian mixture model for classification of certain and uncertain data

There are different types of classification methods for classifying certain data. However, the values of variables are not always certain; they may instead lie in an interval, in which case the data are called uncertain data. In recent years, under the assumption that the uncertain data are normally distributed, several estimators have been proposed for the mean and variance of this distribution. In this paper, we co...


Improving Naive Bayesian Classifier by Discriminative Training

Discriminative classifiers such as Support Vector Machines (SVM) directly learn a discriminant function or a posterior probability model to perform classification. On the other hand, generative classifiers often learn a joint probability model and then use Bayes' rule to construct a posterior classifier. In general, generative classifiers are not as accurate as discriminative classifiers. Ho...
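As a minimal illustration of this generative/discriminative distinction (assuming scikit-learn and a synthetic two-class dataset; not code from the article above), a Gaussian Naive Bayes model learns class-conditional densities and a prior and recovers the posterior via Bayes' rule, whereas logistic regression fits the posterior p(y | x) directly:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB

    # Synthetic two-class dataset.
    X, y = make_classification(n_samples=200, n_features=5, random_state=0)

    # Generative: model p(x | y) and p(y), then obtain p(y | x) via Bayes' rule.
    gen = GaussianNB().fit(X, y)

    # Discriminative: model the posterior p(y | x) directly.
    disc = LogisticRegression(max_iter=1000).fit(X, y)

    print(gen.predict_proba(X[:3]))   # posteriors recovered from the joint model
    print(disc.predict_proba(X[:3]))  # posteriors modelled directly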



Journal:

Volume 5, Issue

Pages  -

Publication date: 2013